
    Two new Probability inequalities and Concentration Results

    Concentration results and probabilistic analysis for combinatorial problems such as the TSP, MWST, and graph coloring have received much attention, but generally only for i.i.d. samples (for the TSP, for example, i.i.d. points in the unit square). Here we prove two probability inequalities that generalize and strengthen martingale inequalities. The inequalities provide the tools to handle more general heavy-tailed and inhomogeneous distributions for combinatorial problems. We prove a wide range of applications: in addition to the TSP, MWST, and graph coloring, we also prove more general results than previously known for concentration in bin packing, subgraph counts, and the Johnson-Lindenstrauss random projection theorem. It is hoped that the strength of the inequalities will serve many more purposes.
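
    For context, a minimal statement of the classical Azuma-Hoeffding martingale inequality, the kind of bound the abstract says is generalized and strengthened; the notation below is the standard textbook one, not taken from the paper:

        \text{If } (S_i)_{i=0}^{n} \text{ is a martingale with } |S_i - S_{i-1}| \le c_i \text{ for all } i, \text{ then}
        \Pr\big[\,|S_n - S_0| \ge t\,\big] \;\le\; 2\exp\!\Big(-\frac{t^2}{2\sum_{i=1}^{n} c_i^2}\Big).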

    Clustering with Spectral Norm and the k-means Algorithm

    There has been much progress on efficient algorithms for clustering data points generated by a mixture of k probability distributions under the assumption that the means of the distributions are well-separated, i.e., the distance between the means of any two distributions is at least Ω(k) standard deviations. These results generally make heavy use of the generative model and particular properties of the distributions. In this paper, we show that a simple clustering algorithm works without assuming any generative (probabilistic) model. Our only assumption is what we call a "proximity condition": the projection of any data point onto the line joining its cluster center to any other cluster center is Ω(k) standard deviations closer to its own center than to the other center. Here the notion of standard deviations is based on the spectral norm of the matrix whose rows represent the difference between a point and the mean of the cluster to which it belongs. We show that in the generative models studied, our proximity condition is satisfied, and so we are able to derive most known results for generative models as corollaries of our main result. We also prove some new results for generative models, e.g., we can cluster all but a small fraction of points assuming only a bound on the variance. Our algorithm relies on the well-known k-means algorithm, and along the way, we prove a result of independent interest: that the k-means algorithm converges to the "true centers" even in the presence of spurious points, provided the initial (estimated) centers are close enough to the corresponding actual centers and all but a small fraction of the points satisfy the proximity condition. Finally, we present a new technique for boosting the ratio of inter-center separation to standard deviation.
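
    As a reference point, a minimal sketch of the classical Lloyd k-means iteration whose convergence the abstract discusses; this is the textbook algorithm, not the paper's full clustering procedure, and all names are illustrative:

        import numpy as np

        def lloyd_kmeans(X, centers, iters=50):
            """Textbook Lloyd iteration: assign points to nearest center, recompute means."""
            for _ in range(iters):
                # Distance from every point to every current center.
                d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
                labels = d.argmin(axis=1)
                # Move each center to the mean of its assigned points.
                for j in range(len(centers)):
                    pts = X[labels == j]
                    if len(pts) > 0:
                        centers[j] = pts.mean(axis=0)
            return centers, labels

    The paper's result says, roughly, that this iteration recovers the true centers when initialized close enough to them, even with some spurious points present.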

    Spectral Approaches to Nearest Neighbor Search

    We study spectral algorithms for the high-dimensional Nearest Neighbor Search problem (NNS). In particular, we consider a semi-random setting where a dataset P in ℝ^d is chosen arbitrarily from an unknown subspace of low dimension k ≪ d, and then perturbed by fully d-dimensional Gaussian noise. We design spectral NNS algorithms whose query time depends polynomially on d and log n (where n = |P|) for large ranges of k, d, and n. Our algorithms use a repeated computation of the top PCA vector/subspace, and are effective even when the random-noise magnitude is much larger than the interpoint distances in P. Our motivation is that in practice, a number of spectral NNS algorithms outperform the random-projection methods that seem otherwise theoretically optimal on worst-case datasets. In this paper we aim to provide theoretical justification for this disparity.
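
    A minimal sketch of the core primitive the abstract mentions, computing the top PCA vector of a point set and projecting onto it; this illustrates only the building block, not the paper's full NNS data structure:

        import numpy as np

        def top_pca_direction(P):
            """Top principal direction of point set P (rows are points)."""
            C = P - P.mean(axis=0)  # center the data
            _, _, Vt = np.linalg.svd(C, full_matrices=False)
            return Vt[0]            # right singular vector for the largest singular value

        def project_onto(P, v):
            """1-D coordinates of the points along direction v."""
            return P @ v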

    Random Separating Hyperplane Theorem and Learning Polytopes

    The Separating Hyperplane Theorem is a fundamental result in convex geometry with myriad applications. Our first result, the Random Separating Hyperplane Theorem (RSH), is a strengthening of this for polytopes. RSH asserts that if the distance between a point a and a polytope K with k vertices and unit diameter in ℝ^d is at least δ, where δ is a fixed constant in (0,1), then a randomly chosen hyperplane separates a and K with probability at least 1/poly(k) and margin at least Ω(δ/√d). An immediate consequence of our result is the first near-optimal bound on the error increase in the reduction from a separation oracle to an optimization oracle over a polytope. RSH has algorithmic applications in learning polytopes. We consider a fundamental problem, denoted the "Hausdorff problem", of learning a unit-diameter polytope K within Hausdorff distance δ, given an optimization oracle for K. Using RSH, we show that with polynomially many random queries to the optimization oracle, K can be approximated within error O(δ). To our knowledge this is the first provable algorithm for the Hausdorff problem. Building on this result, we show that if the vertices of K are well-separated, then an optimization oracle can be used to generate a list of points, each within Hausdorff distance O(δ) of K, with the property that the list contains a point close to each vertex of K. Further, we show how to prune this list to generate a (unique) approximation to each vertex of the polytope. We prove that in many latent variable settings, e.g., topic modeling and LDA, optimization oracles do exist, provided we project to a suitable SVD subspace. Thus, our work yields the first efficient algorithm for finding approximations to the vertices of the latent polytope under the well-separatedness assumption.
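
    A toy Monte-Carlo illustration of the RSH statement under the abstract's setup (a point a at distance at least δ from a polytope K given by its vertex list): sample random unit directions and report one that separates a from K with a prescribed gap. This is an illustrative experiment, not the paper's method:

        import numpy as np

        def random_separating_direction(a, vertices, margin, trials=10000, rng=None):
            """Try random unit normals u until one separates a from conv(vertices) with the given gap."""
            if rng is None:
                rng = np.random.default_rng()
            for _ in range(trials):
                u = rng.normal(size=len(a))
                u /= np.linalg.norm(u)
                # u yields a separating hyperplane with gap >= margin iff u.a - max_v u.v >= margin.
                if a @ u - (vertices @ u).max() >= margin:
                    return u
            return None

    RSH predicts that for margin on the order of δ/√d, a success should occur within poly(k) trials.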

    Principal Component Analysis and Higher Correlations for Distributed Data

    We consider algorithmic problems in the setting in which the input data has been partitioned arbitrarily across many servers. The goal is to compute a function of all the data, and the bottleneck is the communication used by the algorithm. We present algorithms for two illustrative problems on massive data sets: (1) computing a low-rank approximation of a matrix A = A^1 + A^2 + ... + A^s, with matrix A^t stored on server t, and (2) computing a function of a vector a_1 + a_2 + ... + a_s, where server t has the vector a_t; this includes the well-studied special case of computing frequency moments and separable functions, as well as higher-order correlations such as the number of subgraphs of a specified type occurring in a graph. For both problems we give algorithms with nearly optimal communication; in particular, the only dependence on n, the size of the data, is in the number of bits needed to represent indices and words (O(log n)).
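
    A minimal sketch of one reason low communication is possible here: linear sketches commute with the sum A = A^1 + ... + A^s, so each server can compress its share locally and a coordinator simply adds the compressed pieces. This is illustrative only; the paper's actual protocols and guarantees are more involved, and all names below are hypothetical:

        import numpy as np

        def local_sketch(S, A_t):
            """Server t compresses its share A^t with the shared random matrix S."""
            return S @ A_t

        def coordinator_combine(sketches):
            """Because sketching is linear, the sum of sketches equals the sketch of A = sum_t A^t."""
            return sum(sketches)

        # Example: s servers holding n x m shares, shared k x n random projection S.
        rng = np.random.default_rng(0)
        n, m, s, k = 1000, 50, 4, 20
        S = rng.normal(size=(k, n)) / np.sqrt(k)
        parts = [rng.normal(size=(n, m)) for _ in range(s)]
        combined = coordinator_combine(local_sketch(S, A_t) for A_t in parts)
        assert np.allclose(combined, S @ sum(parts))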

    Characterization of a distinct lethal arteriopathy syndrome in twenty-two infants associated with an identical, novel mutation in FBLN4 gene, confirms fibulin-4 as a critical determinant of human vascular elastogenesis

    Background: Vascular elasticity is crucial for maintaining hemodynamics. Molecular mechanisms involved in human elastogenesis are incompletely understood. We describe a syndrome of lethal arteriopathy associated with a novel, identical mutation in the fibulin-4 gene (FBLN4) in a unique cohort of infants from South India. Methods: Clinical characteristics, cardiovascular findings, outcomes, and molecular genetics of twenty-two infants from a distinct population subgroup, presenting with characteristic arterial dilatation and tortuosity during the period August 2004 to June 2011, were studied. Results: Patients (11 males, 11 females) presented at a median age of 1.5 months and belonged to unrelated families from an identical ethno-geographical background; eight had a history of consanguinity. Cardiovascular features included aneurysmal dilatation, elongation, tortuosity, and narrowing of the aorta, pulmonary artery, and their branches. The phenotype included a variable combination of cutis laxa (52%), long philtrum and thin vermillion (90%), micrognathia (43%), hypertelorism (57%), prominent eyes (43%), sagging cheeks (43%), long slender digits (48%), and visible arterial pulsations (38%). Genetic studies revealed an identical c.608A>C (p.Asp203Ala) mutation in exon 7 of the FBLN4 gene in all 22 patients, homozygous in 21, and compound heterozygous in one patient with a p.Arg227Cys mutation in the same conserved cbEGF sequence. Homozygosity was lethal (17/21 died; median age 4 months). Isthmic hypoplasia (n = 9) correlated with early death (≤ 4 months). Conclusions: A lethal genetic disorder characterized by severe deformation of elastic arteries was linked to novel mutations in the FBLN4 gene. While describing a hitherto unreported syndrome in this population subgroup, this study emphasizes the critical role of fibulin-4 in human elastogenesis.

    Algorithmic Geometry of Numbers

    This article is about Algorithmic Geometry of Numbers. The fundamental basis reduction algorithm of Lovász, which first appeared in Lenstra, Lenstra, Lovász [46], was used in Lenstra's algorithm for integer programming and has since been applied in myriad contexts, starting with the factorization of polynomials (A. K. Lenstra [45]). Classical geometry of numbers has a special feature in that it studies geometric properties of (convex) sets, such as volume and width, which come from the realm of continuous mathematics, in relation to lattices, which are discrete objects. This makes it ideal for applications to integer programming and other discrete optimization problems, which seem inherently harder than their "continuous" counterparts such as linear programming.
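
    For context, the standard guarantee of the Lenstra-Lenstra-Lovász (LLL) basis reduction algorithm referenced above, stated in the usual textbook notation; this is classical material, not a claim specific to the article:

        \text{Given a basis of a lattice } L \subset \mathbb{R}^n, \text{ LLL outputs in polynomial time a reduced basis } b_1,\dots,b_n
        \text{ with } \|b_1\| \le 2^{(n-1)/2}\,\lambda_1(L),
        \text{ where } \lambda_1(L) \text{ is the length of a shortest nonzero vector of } L.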